Grad students from other departments (Gov2001)
Send and respond to Email, Discussion, and Chat
We'll assign you a suggested Study Group 
Browse archive of previous years’ emails (Note which now-famous 
scholar is asking the question!) 
In-ter-rupt me as often as necessary 
(Got a dumb question? Assume you are the smartest person in class 
and you eventually will be!) 
When are Gary's office hours? 
• (Big secret: The point of office hours is to prevent students from
  visiting at other times)
• Come whenever you like; if you can't find me or I'm in a meeting, come
  back, talk to my assistant in the office next to me, or email any time

• A new field:
  • mid-1930s: Experiments and random assignment
  • 1950s: The modern theory of inference
  • In your lifetime: Modern causal inference
  • Even more recently: Part of a monumental societal change, "big data",
    and the march of quantification through academic, professional,
    commercial, government and policy fields.
• The number of new methods is increasing fast
• Most important methods originate outside the discipline of statistics
  (random assignment, experimental design, survey research, machine
  learning, MCMC methods, ...). Statistics: abstracts, proves formal
  properties, generalizes, and distributes results back out.
and that historically, political methodologists have been trained in
many different areas
• The crossroads for other disciplines, and one of the best places to
  learn about methods broadly.
• Second largest APSA section (Valuable for the job market!)
• We could teach you all the methods that might prove useful during
  your career, but when you graduate you will be old
• Instead, we teach you the fundamentals, the underlying theory of
  inference, from which statistical models are developed:
  • We will reinvent existing methods by creating them from scratch.
  • We will learn: it's easy to invent new methods too, when needed.
  • The fundamentals help us pick up new methods created by others.
  • This helps us separate the conventions from underlying statistical
    theory. (How to get an F in Econometrics: follow advice from
    Psychometrics. Works in reverse too, even when the foundations are
    identical.)
e.g., How to fit a line to a scatterplot?

[Figure: scatterplot, with x on the horizontal axis]

• visually (tends to be principal components)
• a rule (least squares, least absolute deviations, etc.)
• criteria (unbiasedness, efficiency, sufficiency, admissibility, etc.)
• from a theory of inference, and for a substantive purpose (like causal
  estimation, prediction, etc.)
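
As a rough illustration of the "rule" approach, the sketch below fits a line to simulated data in two ways: least squares (via lm) and least absolute deviations (minimized numerically with optim). The data, the seed, and the names x, y, ls_fit, lad_loss, and lad_fit are assumptions made up for this example.

```r
set.seed(2001)                        # arbitrary seed so the sketch is reproducible
n <- 200
x <- runif(n, 0, 10)                  # hypothetical explanatory variable
y <- 1 + 2 * x + rnorm(n, sd = 3)     # hypothetical outcome; the true line is 1 + 2x

# Rule 1: least squares, via lm()
ls_fit <- coef(lm(y ~ x))

# Rule 2: least absolute deviations, minimizing the sum of |residuals|
lad_loss <- function(b) sum(abs(y - b[1] - b[2] * x))
lad_fit <- optim(ls_fit, lad_loss)$par   # Nelder-Mead search, started at the LS solution

print(rbind(least_squares = ls_fit, least_abs_dev = lad_fit))
```

With well-behaved errors like these the two rules should give similar lines; they tend to differ more when the data contain outliers.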
[Figure: knowledge learned (vertical axis) against time (horizontal axis, from "now" to "later"), comparing statistical programming languages with canned packages]

• We'll use R: a free open source program, a commons, a movement
• and an R program called Zelig (Imai, King, and Lau, 2006-14) which
  simplifies R and helps you up the steep slope fast (see j.mp/Zelig4)
• Now you know what a model is. (It's an abstraction.)
• Is this model true?
• Are models ever true or false?
• Are models ever realistic or not?
• Are models ever useful or not?
• Models of dirt on airplanes, vs models of aerodynamics
• $Y$ is $n \times 1$.
• $y_i$, a number (after we know it)
• $Y_i$, a random variable (before we know it)
• Commonly misunderstood: a "dependent variable" can be
  • a column of numbers in your data set
  • the random variable for each unit $i$.
• Explanatory variables
  • aka "covariates," "independent," or "exogenous" variables
  • $X = \{x_{ij}\}$ is $n \times k$ (observations by variables)
  • A set of columns (variables): $X = \{x_1, \ldots, x_k\}$
  • Row (observation) $i$: $x_i = \{x_{i1}, \ldots, x_{ik}\}$
  • $X$ is fixed (not random).
Is a histogram of y a test of normality?

$Y_i \sim f(\theta_i, \alpha)$      stochastic
$\theta_i = g(X_i, \beta)$      systematic

where
  $Y_i$         random outcome variable
  $f(\cdot)$    probability density
  $\theta_i$    a systematic feature of the density that varies over $i$
  $\alpha$      ancillary parameter (feature of the density constant over $i$)
  $g(\cdot)$    functional form

• Estimation uncertainty: Lack of knowledge of $\beta$ and $\alpha$. Vanishes as $n$
  gets larger.
• Fundamental uncertainty: Represented by the stochastic component.
  Exists no matter what the researcher does; no matter how large $n$ is.
• (If you know the model, is $R^2 = 1$? Can you predict $y$ perfectly?)
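
The two kinds of uncertainty can be seen in a small simulation. This is only a sketch under assumed values: a normal stochastic component, a linear systematic component, and made-up parameters (beta, sigma) and sample sizes. The standard error of the slope stands in for estimation uncertainty and the residual standard deviation for fundamental uncertainty.

```r
set.seed(1)
beta <- c(1, 2)     # hypothetical true effect parameters
sigma <- 1          # hypothetical ancillary parameter (sd of the normal)

one_run <- function(n) {
  x <- runif(n)
  y <- rnorm(n, mean = beta[1] + beta[2] * x, sd = sigma)  # stochastic + systematic
  fit <- summary(lm(y ~ x))
  c(n = n,
    se_slope = fit$coefficients["x", "Std. Error"],  # estimation uncertainty
    resid_sd = fit$sigma)                            # fundamental uncertainty
}

results <- t(sapply(c(100, 10000), one_run))
print(results)
```

The standard error should shrink as n grows, while the residual standard deviation should stay near sigma however much data is collected.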


Gary King (Harvard) The Basics 18 / 61 


Systematic Components: Examples

• $E(Y_i) \equiv \mu_i = X_i\beta = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_k X_{ki}$
• $\Pr(Y_i = 1) \equiv \pi_i = \dfrac{1}{1 + e^{-X_i\beta}}$
• $V(Y_i) \equiv \sigma_i^2 = e^{X_i\beta}$

($\beta$ is an "effect parameter" vector in each, but the meaning differs.)

• Each mathematical form is a class of functional forms
• We choose a member of the class by setting $\beta$
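
A minimal sketch of the three functional forms, evaluated at the same linear predictor. The design matrix X and the vector beta are made-up values used only for illustration.

```r
X <- cbind(1, c(-2, 0, 2))       # hypothetical design: intercept plus one covariate
beta <- c(0.5, 1)                # hypothetical effect parameters

xb <- as.vector(X %*% beta)      # linear predictor X_i beta: -1.5, 0.5, 2.5

mu     <- xb                     # E(Y_i) = X_i beta, unbounded
pi_i   <- 1 / (1 + exp(-xb))     # Pr(Y_i = 1), always inside (0, 1)
sigma2 <- exp(xb)                # V(Y_i), always positive

print(cbind(xb, mu, pi_i, sigma2))
```

Changing beta picks a different member of each class: the same numbers give a different line, a different S-shaped curve, and a different exponential curve.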


Systematic Components: Examples

• We (ultimately) will
  • Assume a class of functional forms (each form is flexible and maps out
    many potential relationships)
  • Within the class, choose a member of the class by estimating $\beta$
  • Since data contain (sampling, measurement, random) error, we will be
    uncertain about:
    • the member of the chosen family (sampling error)
    • the chosen family (model dependence)
• If the true relationship falls outside the assumed class, we
  • Have specification error, and potentially bias
  • Still get the best [linear, logit, etc.] approximation to the correct
    functional form.
  • May be close or far from the truth:

[Figure: best linear approximation]
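
To make "best approximation within the wrong class" concrete, the sketch below simulates binary data from a logistic relationship and then fits both a linear and a logit model. The true coefficient (1.5), the range of x, and all object names are assumptions chosen for illustration.

```r
set.seed(42)
n <- 5000
x <- runif(n, -4, 4)
p_true <- 1 / (1 + exp(-1.5 * x))              # hypothetical true relationship
y <- rbinom(n, size = 1, prob = p_true)

linear_fit <- lm(y ~ x)                        # assumed class excludes the truth
logit_fit  <- glm(y ~ x, family = binomial)    # assumed class contains the truth

grid <- data.frame(x = c(-4, 0, 4))
comparison <- cbind(grid,
                    truth  = 1 / (1 + exp(-1.5 * grid$x)),
                    linear = predict(linear_fit, grid),
                    logit  = predict(logit_fit, grid, type = "response"))
print(comparison)
```

In this setup the linear fit is close near the middle of the data but predicts probabilities outside [0, 1] at the extremes, which is one way the best linear approximation can be far from the truth.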


Overview of Stochastic Components

• Normal: continuous, unimodal, symmetric, unbounded
• Log-normal: continuous, unimodal, skewed, bounded from below by zero
• Bernoulli: discrete, binary outcomes
• Poisson: discrete, countably infinite on the nonnegative integers
  (for counts)
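
One quick way to see these properties is to draw from each distribution and look at the range of the draws. The parameter values and the number of draws M below are arbitrary choices for this sketch.

```r
set.seed(7)
M <- 10000   # number of draws (an arbitrary choice)

draws <- list(
  normal     = rnorm(M, mean = 0, sd = 1),
  log_normal = rlnorm(M, meanlog = 0, sdlog = 1),
  bernoulli  = rbinom(M, size = 1, prob = 0.3),
  poisson    = rpois(M, lambda = 2)
)

# Smallest and largest draw from each: note which supports are bounded or discrete
print(t(sapply(draws, range)))
```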





Choosing systematic and stochastic components

• If one is bounded, so is the other
• If the stochastic component is bounded, the systematic component
  must be globally nonlinear (though possibly locally linear)
• All modeling decisions are about the data generation process: how
  the information made its way from the world (including how the world
  produced the data) to your data set
• What if we don't know the DGP (and we usually don't)?
  • The problem: model dependence
  • Our first approach: make "reasonable" assumptions and check fit (and
    other observable implications of the assumptions)
  • Later:
    • Generalize model: relax assumptions (functional form, distribution, etc.)
    • Detect model dependence
    • Ameliorate model dependence: preprocess data (via matching, etc.)
Probability as a Model of Uncertainty

$\Pr(y \mid M) = \Pr(\text{data} \mid \text{Model})$, where $M = (f, g, X, \beta, \alpha)$.

• 3 axioms define the function $\Pr(\cdot \mid \cdot)$:
  1. $\Pr(z) \geq 0$ for some event $z$
  2. $\Pr(\text{sample space}) = 1$
  3. If $z_1, \ldots, z_k$ are mutually exclusive events,
     $\Pr(z_1 \cup \cdots \cup z_k) = \Pr(z_1) + \cdots + \Pr(z_k)$
• The first two imply $0 \leq \Pr(z) \leq 1$
• Axioms are not assumptions; they can't be wrong.
• From the axioms come all rules of probability theory.
• Rules can be applied analytically or via simulation.


Simulation is used to:

1. solve probability problems
2. evaluate estimators
3. calculate features of probability densities
4. transform statistical results into quantities of interest

• Empirical evidence: students get the right answer far more
  frequently by using simulation than math
Survey Sampling                          Simulation

1. Learn about a population by           1. Learn about a distribution by
   taking a random sample from it           taking random draws from it
2. Use the random sample to              2. Use the random draws to
   estimate a feature of the                approximate a feature of the
   population                               distribution
3. The estimate is arbitrarily           3. The approximation is arbitrarily
   precise for large n                      precise for large M
4. Example: estimate the mean of         4. Example: approximate the mean of
   the population                           the distribution
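
The simulation column can be tried directly. The sketch below approximates the mean of a distribution from M random draws, for increasing M. The exponential distribution with rate 2 is an assumed example, chosen because its mean (1/2) is known and the approximation can be checked against it.

```r
set.seed(123)
# Hypothetical target: the mean of an exponential distribution with rate 2,
# which is known analytically to be 1/2.
approx_mean <- function(M) mean(rexp(M, rate = 2))

M_values <- c(100, 10000, 1000000)
approximations <- sapply(M_values, approx_mean)
print(data.frame(M = M_values, approximation = approximations, truth = 0.5))
```

Unlike n in a survey, M is limited only by computing time, so the approximation can be made as precise as needed.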
Simulation examples for solving probability problems
The Birthday Problem

Given a room with 24 randomly selected people, what is the probability
that at least two have the same birthday?

sims <- 1000
people <- 24
alldays <- seq(1, 365, 1)
sameday <- 0
for (i in 1:sims) {
  room <- sample(alldays, people, replace = TRUE)
  if (length(unique(room)) < people) # same birthday
    sameday <- sameday + 1
}
cat("Probability of >=2 people having the same birthday:", sameday/sims, "\n")

Four runs: .538, .550, .547, .524
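
For comparison, this particular problem also has an exact answer: one minus the probability that all 24 birthdays differ. The sketch below assumes, as the simulation does, 365 equally likely birthdays; the variable names are illustrative.

```r
people <- 24
# Probability that all 24 birthdays differ: (365/365) * (364/365) * ... * (342/365)
p_all_different <- prod((365 - 0:(people - 1)) / 365)
p_shared <- 1 - p_all_different
print(round(p_shared, 3))   # about 0.538
```

The simulated runs scatter around this value, and they would tighten around it with a larger number of simulations.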
Let's Make a Deal

In Let's Make a Deal, Monty Hall offers what is behind one of three doors. Behind a
random door is a car; behind the other two are goats. You choose one door at random.
Monty peeks behind the other two doors and opens the one (or one of the two) with the
goat. He asks whether you'd like to switch your door with the other door that hasn't
been opened yet. Should you switch?

sims <- 1000
WinNoSwitch <- 0
WinSwitch <- 0
doors <- c(1, 2, 3)
for (i in 1:sims) {
  WinDoor <- sample(doors, 1)
  choice <- sample(doors, 1)
  if (WinDoor == choice) # no switch
    WinNoSwitch <- WinNoSwitch + 1
  doorsLeft <- doors[doors != choice] # switch
  if (any(doorsLeft == WinDoor))
    WinSwitch <- WinSwitch + 1
}
cat("Prob(Car | no switch)=", WinNoSwitch/sims, "\n")
cat("Prob(Car | switch)=", WinSwitch/sims, "\n")
Let's Make a Deal

Pr(car|No Switch)    Pr(car|Switch)
.324
.345
.320
.327
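
The same experiment can be sketched without a loop. Because Monty always removes a goat from the two remaining doors, switching wins exactly when the first choice was wrong. The seed, the number of simulations, and the variable names here are assumptions for illustration.

```r
set.seed(99)
sims <- 100000
win_door <- sample(1:3, sims, replace = TRUE)
choice   <- sample(1:3, sims, replace = TRUE)

p_no_switch <- mean(win_door == choice)   # staying wins only if the first pick was right
p_switch    <- mean(win_door != choice)   # switching wins whenever the first pick was wrong

cat("Prob(Car | no switch) =", p_no_switch, "\n")
cat("Prob(Car | switch)    =", p_switch, "\n")
```

The two estimates sum to one by construction and should settle near 1/3 and 2/3, so switching is the better strategy.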
What is a Probability Density?

A probability density is a function, $P(Y)$, such that

1. Sum over all possible $Y$ is 1.0
   • For discrete $Y$: $\sum_{\text{all possible } Y} P(Y) = 1$
   • For continuous $Y$: $\int_{-\infty}^{\infty} P(Y)\,dY = 1$
2. $P(Y) \geq 0$ for every $Y$
Computing Probabilities from Densities

• For both: $\Pr(a \leq Y \leq b) = \int_a^b P(Y)\,dY$
• For discrete: $\Pr(y) = P(y)$
• For continuous: $\Pr(y) = 0$ (why?)
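
A short numerical sketch of these three statements. The standard normal and the Poisson with mean 3 are assumed examples, and the interval endpoints are arbitrary.

```r
# Continuous case: Pr(-1.96 <= Y <= 1.96) for a standard normal
p_cont <- integrate(dnorm, lower = -1.96, upper = 1.96)$value

# Discrete case: Pr(2 <= Y <= 4) for a Poisson with mean 3 is a sum of point masses
p_disc <- sum(dpois(2:4, lambda = 3))

# Why Pr(y) = 0 for continuous Y: the area over an interval around y = 1
# shrinks toward zero as the interval's half-width eps shrinks
eps <- c(1e-1, 1e-3, 1e-5)
p_shrinking <- sapply(eps, function(e) integrate(dnorm, 1 - e, 1 + e)$value)

print(c(continuous = p_cont, discrete = p_disc))
print(data.frame(half_width = eps, probability = p_shrinking))
```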
What you should know about every pdf

• The assignment of a probability or probability density to every
  conceivable value of $Y_i$
• The first principles
• How to use the final expression (but not necessarily the full derivation)
• How to simulate from the density
• How to compute features of the density such as its "moments"
• How to verify that the final expression is indeed a proper density
Uniform Density on the interval [0, 1]

First principles: the process that generates $Y_i$ is such that
• $Y_i$ falls in the interval [0,1] with probability 1: $\int_0^1 P(y)\,dy = 1$
• $\Pr(Y \in (a,b)) = \Pr(Y \in (c,d))$ if $a < b$, $c < d$, and $b - a = d - c$.

[Figure: the uniform density $P(y)$]

• Is it a pdf? How do you know?
• How to simulate? runif(1000)
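
Both questions can be explored numerically. The sketch below checks that R's uniform density integrates to 1 over [0, 1] and that two equal-width intervals receive about the same share of simulated draws. The seed and the particular intervals are arbitrary choices.

```r
set.seed(5)
y <- runif(1000)

# Is it a pdf? The density is nonnegative and integrates to 1 over [0, 1].
area <- integrate(dunif, lower = 0, upper = 1)$value

# Equal-width intervals should receive about equal probability (width 0.2 here).
share_1 <- mean(y > 0.1 & y < 0.3)
share_2 <- mean(y > 0.6 & y < 0.8)

print(c(area = area, share_1 = share_1, share_2 = share_2, mean = mean(y)))
```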
Bernoulli pmf

• First principles about the process that generates $Y_i$:
  • $Y_i$ has 2 mutually exclusive outcomes; and
  • The 2 outcomes are exhaustive
• In this simple case, we will compute features analytically and by
  simulation.
• Mathematical expression for the pmf
  • $\Pr(Y_i = 1 \mid \pi_i) = \pi_i$, $\Pr(Y_i = 0 \mid \pi_i) = 1 - \pi_i$
  • The parameter $\pi$ happens to be interpretable as a probability
  • $\implies \Pr(Y_i = y \mid \pi_i) = \pi_i^{y} (1 - \pi_i)^{1-y}$
  • Alternative notation: $\Pr(Y_i = y \mid \pi_i) = \text{Bernoulli}(y \mid \pi_i) = f_b(y \mid \pi_i)$
Graphical summary of the Bernoulli
Features of the Bernoulli: analytically

• Expected value:
  $E(Y) = \sum_{\text{all } y} y\,P(y)$
  $\phantom{E(Y)} = 0 \Pr(0) + 1 \Pr(1)$
  $\phantom{E(Y)} = \pi$
• Variance:
  $V(Y) = E[(Y - E(Y))^2]$
  $\phantom{V(Y)} = E(Y^2) - E(Y)^2$
  $\phantom{V(Y)} = E(Y^2) - \pi^2$
• How do we compute $E(Y^2)$?
Expected values of functions of random variables

$E[g(Y)] = \sum_{\text{all } y} g(y)\,P(y)$

or

$E[g(Y)] = \int_{-\infty}^{\infty} g(y)\,P(y)\,dy$

For example,

$E(Y^2) = \sum_{\text{all } y} y^2 P(y)$
$\phantom{E(Y^2)} = 0^2 \Pr(0) + 1^2 \Pr(1)$
$\phantom{E(Y^2)} = \pi$
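Variance of the Bernoulli (uses above results)

$V(Y) = E[(Y - E(Y))^2]$      (The definition)
$\phantom{V(Y)} = E(Y^2) - E(Y)^2$      (An easier version)
$\phantom{V(Y)} = \pi - \pi^2$
$\phantom{V(Y)} = \pi(1 - \pi)$

This makes sense:

[Figure: variance (vertical axis, up to .25) against the mean (horizontal axis, from 0 to 1, with a tick at .5)]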
How to Simulate from the Bernoulli with parameter $\pi$

• Take one draw $u$ from a uniform density on the interval [0,1]
• Set $\pi$ to a particular value
• Set $y = 1$ if $u < \pi$ and $y = 0$ otherwise
• In R:
    sims <- 1000                 # set parameters
    bernpi <- 0.2
    u <- runif(sims)             # uniform sims
    y <- as.integer(u < bernpi)
    y                            # print results
• Running the program gives:
    0 0 0 1 0 0 1 1 0 0 1 1 1 0 ...
• What can we do with the simulations?
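
One thing to do with the simulations is to approximate features of the density and compare them with the analytical results. This sketch reuses the recipe above with a larger number of draws; the seed and object names are assumptions.

```r
set.seed(10)
sims <- 100000
bernpi <- 0.2
y <- as.integer(runif(sims) < bernpi)

sim_mean <- mean(y)                  # approximates E(Y) = pi
sim_EY2  <- mean(y^2)                # approximates E(Y^2) = pi
sim_var  <- sim_EY2 - sim_mean^2     # approximates V(Y) = pi * (1 - pi)

print(rbind(simulated = c(mean = sim_mean, EY2 = sim_EY2, var = sim_var),
            analytic  = c(mean = bernpi, EY2 = bernpi, var = bernpi * (1 - bernpi))))
```

Because y only takes the values 0 and 1, y^2 equals y, which is why the simulated E(Y^2) matches the simulated mean exactly.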
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Binomial Distribution 





First principles: 
e N iid Bernoulli trials, y1,..., yw 
e The trials are independent 
e The trials are identically distributed 
e We observe Y = 37^ , y; 
Density: 


P(Y = yx) = (7) (1 — c) 


Explanation: 
o (2) because (1 0 1) and (1 1 0) are both y — 2. 


e 7” because y successes with 7 probability each (product taken due to 
independence) 


e (1— x)" * because N — y failures with 1 — 7 probability each 
e Mean E(Y) = Na 
e Variance V(Y) = a(1 — «)/N. 
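A quick numerical check of the density, sketched under the assumption N = 3 and $\pi$ = 0.2 (values chosen only for illustration). The helper `binom_pmf` is a hypothetical name; `dbinom()` is R's built-in binomial density. For the count Y, the moments computed from the density are $N\pi$ and $N\pi(1 - \pi)$.

```r
# Binomial density written out term by term (hypothetical helper)
binom_pmf <- function(y, N, p) choose(N, y) * p^y * (1 - p)^(N - y)

N <- 3                            # assumed number of trials
p <- 0.2                          # assumed success probability pi
y <- 0:N

pmf <- binom_pmf(y, N, p)

choose(3, 2)                      # 3 orderings give y = 2: (1 1 0), (1 0 1), (0 1 1)
all.equal(pmf, dbinom(y, N, p))   # agrees with the built-in density
sum(pmf)                          # probabilities sum to one
sum(y * pmf)                      # mean, equals N * p
sum((y - N * p)^2 * pmf)          # variance of the count, equals N * p * (1 - p)
```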


How to simulate from the Binomial distribution

• To simulate from the Binomial($\pi$; N):
  • Simulate N independent Bernoulli variables, $Y_1, \dots, Y_N$, each with parameter $\pi$
  • Add them up: $\sum_{i=1}^{N} Y_i$
• What can you do with the simulations?
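The two steps can be sketched in R as follows. The seed, N = 10, $\pi$ = 0.2, and the number of simulations are assumptions for illustration only.

```r
set.seed(2)                       # assumed seed
sims   <- 5000                    # assumed number of binomial draws
N      <- 10                      # assumed number of Bernoulli trials per draw
bernpi <- 0.2                     # assumed Bernoulli parameter pi

# One row per binomial draw, one column per Bernoulli trial
u <- matrix(runif(sims * N), nrow = sims, ncol = N)
y <- rowSums(u < bernpi)          # add up the N Bernoulli variables in each row

mean(y)                           # approximates E(Y)
var(y)                            # approximates V(Y)
mean(y >= 5)                      # approximates Pr(Y >= 5)
```

Any feature of the distribution (a mean, a quantile, a tail probability) can be approximated the same way from the simulated y.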


Where to get uniform random numbers

• Random is not haphazard (e.g., Benford's law)
• Random number generators are perfectly predictable (what?)
• We use pseudo-random numbers which have (a) digits that occur with 1/10th probability, (b) no time series patterns, etc.
• How to create real random numbers?
• Some chips now use quantum effects
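The sense in which pseudo-random numbers are "perfectly predictable" can be seen in a short R sketch: resetting the generator to the same seed reproduces the same draws. The seed value is arbitrary and assumed here only for illustration.

```r
set.seed(12345)                   # arbitrary seed, assumed for illustration
first <- runif(5)

set.seed(12345)                   # reset the generator to the same state
second <- runif(5)

identical(first, second)          # TRUE: same seed, same "random" numbers
```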


Discretization for random draws from discrete pmfs

[Figure: two panels. "Discretization" plots the density P(y) against y; "Inverse CDF" plots cdf(Y) against y.]

• Divide up PDF into a grid
• Approximate probabilities by trapezoids
• Map [0,1] uniform draws to the proportion area in each trapezoid
• Return midpoint of each trapezoid
• More trapezoids ⇝ better approximation
• (Works for a few dimensions, but infeasible for many)
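The steps above can be sketched in R. The target density (a standard normal restricted to [-4, 4]), the grid size, and all variable names are assumptions made for illustration, not part of the slides.

```r
set.seed(3)                                  # assumed seed
f    <- function(y) dnorm(y)                 # assumed target density
grid <- seq(-4, 4, length.out = 201)         # grid edges (200 trapezoids)

left  <- head(grid, -1)
right <- tail(grid, -1)
area  <- (f(left) + f(right)) / 2 * (right - left)   # trapezoid areas
prob  <- area / sum(area)                    # proportion of area in each trapezoid
mid   <- (left + right) / 2                  # midpoint of each trapezoid

u   <- runif(10000)                          # uniform draws on [0,1]
idx <- findInterval(u, cumsum(prob)) + 1     # trapezoid whose cumulative area covers u
idx <- pmin(idx, length(mid))                # guard against rounding at the top edge
draws <- mid[idx]                            # return the midpoint

c(mean = mean(draws), sd = sd(draws))
```

Increasing the number of grid points makes the returned midpoints a finer approximation to draws from `f`.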


Inverse CDF: drawing from arbitrary continuous pdfs

[Figure: two panels. "Discretization" plots the density P(y) against y; "Inverse CDF" plots cdf(Y) against y.]

• From the pdf f(Y), compute the cdf: $P(Y \le y) \equiv F(y) = \int_{-\infty}^{y} f(z)\,dz$
• Define the inverse cdf $F^{-1}(y)$, such that $F^{-1}[F(y)] = y$
• Draw random uniform number, U
• Then $F^{-1}(U)$ gives a random draw from f(Y).
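A worked example, assuming an exponential density $f(y) = \lambda e^{-\lambda y}$ with rate $\lambda = 2$ (chosen only because its cdf inverts in closed form). Here $F(y) = 1 - e^{-\lambda y}$, so $F^{-1}(u) = -\log(1 - u)/\lambda$. The name `Finv` is hypothetical.

```r
set.seed(4)                                  # assumed seed
rate <- 2                                    # assumed exponential rate

Finv <- function(u) -log(1 - u) / rate       # inverse cdf of the exponential

u     <- runif(10000)                        # random uniform numbers U
draws <- Finv(u)                             # F^{-1}(U): draws from f(Y)

mean(draws)                                  # approximates E(Y) = 1 / rate
all.equal(Finv(pexp(1.3, rate)), 1.3)        # checks F^{-1}[F(y)] = y at y = 1.3
```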

Using Inverse CDF to Improve Discretization Method

• Refined Discretization Method:
  • Choose interval randomly as above (based on area in trapezoids)
  • Draw a number within each trapezoid by the inverse CDF method applied to the trapezoidal approximation.
• Drawing random numbers from arbitrary multivariate densities: now an enormous literature


Normal Distribution

• Many different first principles
• A common one is the central limit theorem
• The univariate normal density (with mean $\mu_i$, variance $\sigma^2$):

$$N(y_i \mid \mu_i, \sigma^2) = (2\pi\sigma^2)^{-1/2} \exp\left( \frac{-(y_i - \mu_i)^2}{2\sigma^2} \right)$$

• The stylized normal: $f_{stn}(y_i \mid \mu_i) = N(y_i \mid \mu_i, 1)$

$$f_{stn}(y_i \mid \mu_i) = (2\pi)^{-1/2} \exp\left( \frac{-(y_i - \mu_i)^2}{2} \right)$$

• The standardized normal: $f_{sn}(y_i) = N(y_i \mid 0, 1) = \phi(y_i)$

$$f_{sn}(y_i) = (2\pi)^{-1/2} \exp\left( \frac{-y_i^2}{2} \right)$$
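The three densities can be checked against R's built-in `dnorm()`. This is a small sketch; the function name `normal_density` and the evaluation points are assumptions for illustration. Note that `dnorm()` takes the standard deviation, not the variance.

```r
# Univariate normal density written out from the formula (hypothetical helper)
normal_density <- function(y, mu, sigma2) {
  (2 * pi * sigma2)^(-1/2) * exp(-(y - mu)^2 / (2 * sigma2))
}

y <- c(-1, 0, 0.5, 2)                        # assumed evaluation points

# General case: mean 1, variance 4 (so sd = 2 in dnorm)
all.equal(normal_density(y, 1, 4), dnorm(y, mean = 1, sd = 2))
# Stylized normal: variance fixed at 1
all.equal(normal_density(y, 1, 1), dnorm(y, mean = 1))
# Standardized normal: mean 0, variance 1
all.equal(normal_density(y, 0, 1), dnorm(y))
```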

Multivariate Normal Distribution

• Let $Y_i = \{Y_{1i}, \dots, Y_{ki}\}$ be a k × 1 vector, jointly random:

$$Y_i \sim N(y_i \mid \mu_i, \Sigma)$$

  where $\mu_i$ is k × 1 and $\Sigma$ is k × k. For k = 2,

$$\mu_i = \begin{pmatrix} \mu_{1i} \\ \mu_{2i} \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}$$

• Mathematical form:

$$N(y_i \mid \mu_i, \Sigma) = (2\pi)^{-k/2} |\Sigma|^{-1/2} \exp\left[ -\tfrac{1}{2} (y_i - \mu_i)' \Sigma^{-1} (y_i - \mu_i) \right]$$

• Simulating once from this density produces k numbers. Special algorithms are used to generate normal random variates (in R, mvrnorm(), from the MASS library).

Multivariate Normal Distribution

• Moments: $E(Y) = \mu_i$, $V(Y) = \Sigma$, $\mathrm{Cov}(Y_1, Y_2) = \sigma_{12} = \sigma_{21}$.
• $\mathrm{Corr}(Y_1, Y_2) = \dfrac{\sigma_{12}}{\sigma_1 \sigma_2}$
• Marginals:

$$N(Y_1 \mid \mu_1, \sigma_1^2) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} N(y_i \mid \mu_i, \Sigma)\, dy_2\, dy_3 \cdots dy_k$$
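The moments can be checked by simulation. The sketch below assumes k = 2 and illustrative parameter values; `mvrnorm()` is from the MASS package, and the variable names are hypothetical.

```r
library(MASS)                                # provides mvrnorm()

set.seed(5)                                  # assumed seed
mu  <- c(0.5, 0.5)                           # assumed means
s1  <- 0.6; s2 <- 0.6                        # assumed standard deviations
rho <- 0.5                                   # assumed correlation
s12 <- rho * s1 * s2                         # implied covariance sigma_12
Sigma <- matrix(c(s1^2, s12,
                  s12,  s2^2), nrow = 2, byrow = TRUE)

draws <- mvrnorm(n = 20000, mu = mu, Sigma = Sigma)

dim(draws)                                   # each draw (row) produces k = 2 numbers
colMeans(draws)                              # approximates E(Y) = mu
cov(draws)                                   # approximates V(Y) = Sigma
cor(draws[, 1], draws[, 2])                  # approximates sigma_12 / (sigma_1 * sigma_2)
sd(draws[, 1])                               # marginal of Y1 has standard deviation sigma_1
```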

Truncated bivariate normal examples (for $\beta^b$ and $\beta^w$)

[Figure: three panels, (a), (b), and (c), each showing a truncated bivariate normal density.]

Panel   μ1    μ2    σ1     σ2     ρ
(a)     0.5   0.5   0.15   0.15   0
(b)     0.1   0.9   0.15   0.15   0
(c)     0.8   0.8   0.6    0.6    0.5

Parameters are $\mu_1$, $\mu_2$, $\sigma_1$, $\sigma_2$, and $\rho$.
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We will stop here this year and skip to the next set of slides. 
Please refer to the slides below for further information on probability 
densities and random number generation; they offer more sophisticated . 


Beta (continuous) density

• Used to model proportions.
• We'll use it first to generalize the Binomial distribution
• y falls in the interval [0,1]
• Takes on a variety of flexible forms, depending on the parameter values:

[Figure: beta densities P(y) for several parameter values.]

Standard Parameterization

$$\mathrm{Beta}(y \mid \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, y^{\alpha - 1} (1 - y)^{\beta - 1}$$

where $\Gamma(x)$ is the gamma function:

$$\Gamma(x) = \int_0^{\infty} z^{x - 1} e^{-z}\, dz$$

For integer values of x, $\Gamma(x + 1) = x! = x(x - 1)(x - 2) \cdots 1$.

Non-integer values of x produce a continuous interpolation. In R or gauss: gamma(x);

Intuitive? The moments help some:

$$E(Y) = \frac{\alpha}{\alpha + \beta}$$

$$V(Y) = \frac{\alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$$
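A short R sketch of these pieces, assuming $\alpha = 2$ and $\beta = 5$ purely for illustration. `beta_density` is a hypothetical helper written from the formula; `dbeta()` and `integrate()` are base R.

```r
gamma(5)                                     # 24, which is 4! since Gamma(x + 1) = x!
gamma(4.5)                                   # non-integer argument: continuous interpolation

# Beta density in the standard parameterization (hypothetical helper)
beta_density <- function(y, a, b) {
  gamma(a + b) / (gamma(a) * gamma(b)) * y^(a - 1) * (1 - y)^(b - 1)
}

a <- 2; b <- 5                               # assumed parameter values
y <- c(0.1, 0.3, 0.7)
all.equal(beta_density(y, a, b), dbeta(y, a, b))

# Moments by numerical integration, to compare with the closed forms
m <- integrate(function(y) y * dbeta(y, a, b), 0, 1)$value
v <- integrate(function(y) (y - m)^2 * dbeta(y, a, b), 0, 1)$value
c(mean = m, closed_form = a / (a + b))
c(variance = v, closed_form = a * b / ((a + b)^2 * (a + b + 1)))
```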

Alternative parameterization

Set $\mu = E(Y) = \dfrac{\alpha}{\alpha + \beta}$ and $\dfrac{\mu(1 - \mu)\gamma}{1 + \gamma} = V(Y) = \dfrac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}$, solve for $\alpha$ and $\beta$, and substitute in.

Result:

$$\mathrm{beta}(y \mid \mu, \gamma) = \frac{\Gamma\left(\mu\gamma^{-1} + (1 - \mu)\gamma^{-1}\right)}{\Gamma\left(\mu\gamma^{-1}\right)\Gamma\left[(1 - \mu)\gamma^{-1}\right]}\, y^{\mu\gamma^{-1} - 1} (1 - y)^{(1 - \mu)\gamma^{-1} - 1}$$

where now $E(Y) = \mu$ and $\gamma$ is an index of variation that varies with $\mu$.

Reparameterization like this will be key throughout the course.
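Solving the two moment equations gives $\alpha = \mu/\gamma$ and $\beta = (1 - \mu)/\gamma$, which is what the exponents $\mu\gamma^{-1} - 1$ and $(1 - \mu)\gamma^{-1} - 1$ in the result reflect. A minimal R sketch of the mapping follows; the function name `beta_mu_gamma`, the argument name `gam`, and the parameter values are assumptions for illustration.

```r
# Beta density in the (mu, gamma) parameterization, via R's dbeta()
# (hypothetical helper; alpha = mu / gamma, beta = (1 - mu) / gamma)
beta_mu_gamma <- function(y, mu, gam) {
  dbeta(y, shape1 = mu / gam, shape2 = (1 - mu) / gam)
}

mu  <- 0.3                                   # assumed mean
gam <- 0.25                                  # assumed index of variation
a <- mu / gam
b <- (1 - mu) / gam

# The standard-parameterization moments recover the targets
all.equal(a / (a + b), mu)
all.equal(a * b / ((a + b)^2 * (a + b + 1)), mu * (1 - mu) * gam / (1 + gam))

beta_mu_gamma(c(0.1, 0.3, 0.6), mu, gam)     # density at a few points
```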


Beta-Binomial

Useful if the variance of the count is not approximately $N\pi(1 - \pi)$ (equivalently, the variance of the proportion Y/N is not approximately $\pi(1 - \pi)/N$).

How to simulate

(First principles are easy to see from this too.)

• Begin with N Bernoulli trials with parameter $\pi_j$, j = 1, ..., N (not necessarily independent or identically distributed)
• Choose $\mu = E(\pi_j)$ and $\gamma$
• Draw $\tilde{\pi}$ from $\mathrm{Beta}(\pi \mid \mu, \gamma)$ (without this step we get Binomial draws)
• Draw N Bernoulli variables $\tilde{z}_j$ (j = 1, ..., N) from $\mathrm{Bernoulli}(z_j \mid \tilde{\pi})$
• Add up the $\tilde{z}$'s to get $y = \sum_{j=1}^{N} \tilde{z}_j$, which is a draw from the beta-binomial.
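The simulation steps can be sketched in R. The sketch assumes the mapping $\alpha = \mu/\gamma$, $\beta = (1 - \mu)/\gamma$ implied by the alternative beta parameterization, and uses illustrative values for N, $\mu$, and $\gamma$; all variable names are hypothetical.

```r
set.seed(6)                                  # assumed seed
sims <- 20000                                # assumed number of beta-binomial draws
N    <- 10                                   # assumed number of Bernoulli trials
mu   <- 0.3                                  # assumed E(pi_j)
gam  <- 0.25                                 # assumed gamma

# Draw pi-tilde from Beta(pi | mu, gamma), one per simulated y
pi_tilde <- rbeta(sims, shape1 = mu / gam, shape2 = (1 - mu) / gam)

# Draw N Bernoulli variables per row; row i uses pi_tilde[i]
u <- matrix(runif(sims * N), nrow = sims, ncol = N)
z <- u < pi_tilde

y <- rowSums(z)                              # add up the z's

mean(y)                                      # approximates N * mu
var(y)                                       # larger than the binomial value below
N * mu * (1 - mu)                            # binomial variance with pi fixed at mu
```

Skipping the `rbeta()` step and fixing `pi_tilde` at `mu` would return ordinary binomial draws.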

Beta-Binomial Analytics

Recall:

$$\Pr(A \mid B) = \frac{\Pr(AB)}{\Pr(B)} \implies \Pr(AB) = \Pr(A \mid B)\Pr(B)$$

Plan:
• Derive the joint density of y and $\pi$. Then
• Average over the unknown $\pi$ dimension

Hence, the beta-binomial (or extended beta-binomial):

$$\mathrm{BB}(y_i \mid \mu, \gamma) = \int_0^1 \mathrm{Binomial}(y_i \mid \pi) \times \mathrm{Beta}(\pi \mid \mu, \gamma)\, d\pi = \int_0^1 P(y_i, \pi \mid \mu, \gamma)\, d\pi$$
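The averaging step can be carried out numerically. This sketch integrates the joint density over $\pi$ with `integrate()`, assuming $\alpha = \mu/\gamma$ and $\beta = (1 - \mu)/\gamma$ and illustrative values of N, $\mu$, and $\gamma$; `bb_pmf` is a hypothetical helper.

```r
# Beta-binomial probability by numerical integration over pi (hypothetical helper)
bb_pmf <- function(y, N, mu, gam) {
  a <- mu / gam
  b <- (1 - mu) / gam
  joint <- function(p) dbinom(y, N, p) * dbeta(p, a, b)   # Binomial x Beta
  integrate(joint, lower = 0, upper = 1)$value
}

N   <- 10                                    # assumed number of trials
mu  <- 0.3                                   # assumed mean of pi
gam <- 0.25                                  # assumed gamma

pmf <- sapply(0:N, bb_pmf, N = N, mu = mu, gam = gam)

sum(pmf)                                     # close to 1
sum(0:N * pmf)                               # close to N * mu
```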

Poisson Distribution

• Begin with an observation period:

[Figure: timeline of the observation period, starting at 0, with the count observed at the end.]

• All assumptions are about the events that occur between the start and when we observe the count. The process of event generation is assumed not observed.
• 0 events occur at the start of the period
• Only observe number of events at the end of the period
• No 2 events can occur at the same time
• Pr(event at time t | all events up to time t - 1) is constant for all t.

Poisson Distribution

• First principles imply:

$$\mathrm{Poisson}(y_i \mid \lambda) = \begin{cases} \dfrac{e^{-\lambda}\lambda^{y_i}}{y_i!} & \text{for } y_i = 0, 1, \dots \\ 0 & \text{otherwise} \end{cases}$$

• $E(Y) = \lambda$
• $V(Y) = \lambda$
• That the variance goes up with the mean makes sense, but should they be equal?








Poisson Distribution

• If we assume Poisson dispersion, but Y|X is over-dispersed, standard errors are too small.
• If we assume Poisson dispersion, but Y|X is under-dispersed, standard errors are too large.
• How to simulate? We'll use canned random number generators.
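In R the canned generator is `rpois()`. A minimal sketch, assuming $\lambda = 3$ for illustration, that also checks the density formula against `dpois()` and the equality of mean and variance:

```r
set.seed(7)                                  # assumed seed
lambda <- 3                                  # assumed Poisson parameter

y <- rpois(20000, lambda)                    # canned Poisson random number generator

mean(y)                                      # approximates E(Y) = lambda
var(y)                                       # approximates V(Y) = lambda

# Density formula written out, compared with the built-in dpois()
k <- 0:5
all.equal(exp(-lambda) * lambda^k / factorial(k), dpois(k, lambda))
```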


Gamma Density

• Used to model durations and other nonnegative variables
• We'll use first to generalize the Poisson
• Parameters: $\phi > 0$ is the mean and $\sigma^2 > 1$ is an index of variability.
• Moments: mean $E(Y) = \phi > 0$ and variance $V(Y) = \phi(\sigma^2 - 1)$

$$\mathrm{gamma}(y \mid \phi, \sigma^2) = \frac{y^{\phi(\sigma^2 - 1)^{-1} - 1}\, e^{-y(\sigma^2 - 1)^{-1}}}{\Gamma\left[\phi(\sigma^2 - 1)^{-1}\right] (\sigma^2 - 1)^{\phi(\sigma^2 - 1)^{-1}}}$$
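This parameterization maps onto R's `dgamma()` and `rgamma()` with shape $\phi/(\sigma^2 - 1)$ and scale $\sigma^2 - 1$; that mapping is inferred here by matching the stated mean and variance, not taken from the slides. The helper name and parameter values are assumptions for illustration.

```r
# Gamma density in the (phi, sigma^2) parameterization, written from the formula
# (hypothetical helper)
gamma_phi_sigma2 <- function(y, phi, sigma2) {
  k <- phi / (sigma2 - 1)                    # implied shape
  y^(k - 1) * exp(-y / (sigma2 - 1)) / (gamma(k) * (sigma2 - 1)^k)
}

phi    <- 4                                  # assumed mean
sigma2 <- 3                                  # assumed index of variability (> 1)

y <- c(0.5, 2, 6)
all.equal(gamma_phi_sigma2(y, phi, sigma2),
          dgamma(y, shape = phi / (sigma2 - 1), scale = sigma2 - 1))

set.seed(8)                                  # assumed seed
draws <- rgamma(20000, shape = phi / (sigma2 - 1), scale = sigma2 - 1)
mean(draws)                                  # approximates phi
var(draws)                                   # approximates phi * (sigma2 - 1)
```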


Negative Binomial

• Same logic as the beta-binomial generalization of the binomial
• Parameters $\phi > 0$ and dispersion parameter $\sigma^2 > 1$
• Moments: mean $E(Y) = \phi > 0$ and variance $V(Y) = \sigma^2\phi$
• Allows over-dispersion: $V(Y) > E(Y)$.
• As $\sigma^2 \to 1$, $\mathrm{NegBin}(y \mid \phi, \sigma^2) \to \mathrm{Poisson}(y \mid \phi)$ (i.e., small $\sigma^2$ makes the variation from the gamma vanish)

How to simulate (and first principles)
• Choose $E(Y) = \phi$ and $\sigma^2$
• Draw $\tilde{\lambda}$ from $\mathrm{gamma}(\lambda \mid \phi, \sigma^2)$.
• Draw Y from $\mathrm{Poisson}(y \mid \tilde{\lambda})$, which gives one draw from the negative binomial.
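The simulation steps can be sketched in R. The gamma draw assumes shape $\phi/(\sigma^2 - 1)$ and scale $\sigma^2 - 1$, the mapping that gives mean $\phi$ and variance $\phi(\sigma^2 - 1)$; the parameter values and names are illustrative.

```r
set.seed(9)                                  # assumed seed
sims   <- 20000                              # assumed number of draws
phi    <- 4                                  # assumed E(Y)
sigma2 <- 3                                  # assumed dispersion parameter (> 1)

# Draw lambda-tilde from gamma(lambda | phi, sigma^2)
lambda_tilde <- rgamma(sims, shape = phi / (sigma2 - 1), scale = sigma2 - 1)

# Draw Y from Poisson(y | lambda-tilde): one negative binomial draw per lambda
y <- rpois(sims, lambda_tilde)

mean(y)                                      # approximates phi
var(y)                                       # approximates sigma2 * phi, above the mean
```

The extra variance relative to the Poisson comes entirely from the gamma step: $V(Y) = E(\lambda) + V(\lambda) = \phi + \phi(\sigma^2 - 1) = \sigma^2\phi$.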

Recall:

$$\Pr(A \mid B) = \frac{\Pr(AB)}{\Pr(B)} \implies \Pr(AB) = \Pr(A \mid B)\Pr(B)$$

$$\mathrm{NegBin}(y \mid \phi, \sigma^2) = \int_0^{\infty} \mathrm{Poisson}(y \mid \lambda) \times \mathrm{gamma}(\lambda \mid \phi, \sigma^2)\, d\lambda$$
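The integral can be evaluated numerically and compared with R's built-in `dnbinom()`. The sketch assumes the gamma mapping shape $\phi/(\sigma^2 - 1)$, scale $\sigma^2 - 1$, under which the result corresponds to `dnbinom()` with `size = phi / (sigma2 - 1)` and `mu = phi`; `negbin_pmf` is a hypothetical helper and the parameter values are illustrative.

```r
# Negative binomial probability by integrating Poisson x gamma over lambda
# (hypothetical helper)
negbin_pmf <- function(y, phi, sigma2) {
  joint <- function(lam) {
    dpois(y, lam) * dgamma(lam, shape = phi / (sigma2 - 1), scale = sigma2 - 1)
  }
  integrate(joint, lower = 0, upper = Inf)$value
}

phi    <- 4                                  # assumed mean
sigma2 <- 3                                  # assumed dispersion parameter (> 1)
y      <- 0:8

by_integral <- sapply(y, negbin_pmf, phi = phi, sigma2 = sigma2)
by_dnbinom  <- dnbinom(y, size = phi / (sigma2 - 1), mu = phi)

all.equal(by_integral, by_dnbinom, tolerance = 1e-5)
```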
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